Communication-Avoiding Optimization Methods for Distributed Massive-Scale Sparse Inverse Covariance Estimation
Across a variety of scientific disciplines, sparse inverse covariance
estimation is a popular tool for capturing the underlying dependency
relationships in multivariate data. Unfortunately, most estimators are not
scalable enough to handle the sizes of modern high-dimensional data sets (often
on the order of terabytes), and assume Gaussian samples. To address these
deficiencies, we introduce HP-CONCORD, a highly scalable optimization method
for estimating a sparse inverse covariance matrix based on a regularized
pseudolikelihood framework, without assuming Gaussianity. Our parallel proximal
gradient method uses a novel communication-avoiding linear algebra algorithm
and runs across a multi-node cluster with up to 1k nodes (24k cores), achieving
parallel scalability on problems with up to ~819 billion parameters (1.28
million dimensions); even on a single node, HP-CONCORD demonstrates
scalability, outperforming a state-of-the-art method. We also use HP-CONCORD to
estimate the underlying dependency structure of the brain from fMRI data, and
use the result to identify functional regions automatically. The results show
good agreement with a clustering from the neuroscience literature. (Main paper: 15 pages; appendix: 24 pages.)
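The abstract names a proximal gradient method but does not show it. As a minimal serial sketch of the general approach (an ISTA-style step for an l1-penalized objective; the variable name `Omega` and the toy data are illustrative, not HP-CONCORD's actual implementation), each iteration takes a gradient step on the smooth part of the objective and then applies elementwise soft-thresholding, which drives small entries exactly to zero and yields a sparse estimate:

```python
import numpy as np

def soft_threshold(X, t):
    # Elementwise soft-thresholding: the proximal operator of t * ||X||_1.
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def prox_grad_step(Omega, grad, step, lam):
    # One ISTA-style proximal gradient step on an l1-penalized objective:
    # gradient step on the smooth part, then soft-threshold.
    return soft_threshold(Omega - step * grad, step * lam)

# Toy example (illustrative data): small entries are zeroed out,
# so the iterate is sparse even though the gradient is dense.
Omega = np.eye(3)
grad = np.array([[0.0, 0.05, 0.0],
                 [0.05, 0.0, 0.9],
                 [0.0, 0.9, 0.0]])
Omega_next = prox_grad_step(Omega, grad, step=1.0, lam=0.1)
```

The parallel, communication-avoiding machinery in the paper lies in how the gradient (a product involving large sparse and dense matrices) is computed across nodes, not in this per-entry update.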
Compiler Support for Sparse Tensor Computations in MLIR
Sparse tensors arise in problems in science, engineering, machine learning,
and data analytics. Programs that operate on such tensors can exploit sparsity
to reduce storage requirements and computational time. Developing and
maintaining sparse software by hand, however, is a complex and error-prone
task. Therefore, we propose treating sparsity as a property of tensors, not a
tedious implementation task, and letting a sparse compiler generate sparse code
automatically from a sparsity-agnostic definition of the computation. This
paper discusses integrating this idea into MLIR.
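The key idea, sparsity as a property of the data rather than of the code, can be illustrated outside MLIR with a small Python sketch (MLIR's sparse compiler makes the analogous choice at compile time from sparsity annotations, whereas here the storage-aware kernel is selected at runtime by the operand's type):

```python
import numpy as np
from scipy import sparse

# A sparsity-agnostic kernel: y = A @ x + b, written once.
# Whether A is a dense ndarray or a CSR sparse matrix is a property
# of the operand, not of this code; the storage-aware implementation
# of @ is dispatched on A's type.
def affine(A, x, b):
    return A @ x + b

x = np.array([1.0, 2.0, 3.0])
b = np.array([0.5, 0.5, 0.5])
A_dense = np.array([[1.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0],
                    [0.0, 0.0, 3.0]])
A_sparse = sparse.csr_matrix(A_dense)  # same values, sparse storage

y_dense = affine(A_dense, x, b)    # dense kernel
y_sparse = affine(A_sparse, x, b)  # sparse kernel, same definition
```

Both calls produce the same result, but the sparse path never touches the zero entries; the compiler-based approach in the paper gets the same separation of concerns without any runtime dispatch.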
Write-Avoiding Algorithms
Short version of the technical report available at http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-163.pdf as Technical Report No. UCB/EECS-2015-163.
Communication Avoidance for Algorithms with Sparse All-to-all Interactions
In parallel computing environments from multicore systems to cloud computers and supercomputers, data movement is the dominant cost in both running time and energy usage. Even worse, hardware trends suggest that the gap between computation and data movement, in both memory systems and interconnect networks, will continue to grow. Minimizing communication is therefore necessary in devising scalable parallel algorithms. This work discusses parallelizing kernels in applications ranging from chemistry and cosmology to machine learning.

We have developed new communication-avoiding algorithms for problems with all-to-all interactions, such as many-body and matrix computations, taking into account their sparsity patterns, whether arising from a cutoff distance, symmetry, or data sparsity. Our algorithms are communication-efficient (some are provably optimal) and scale to tens of thousands of processors, exhibiting orders-of-magnitude speedups over more commonly used algorithms. These all-to-all computational patterns arise in scientific simulations and machine learning.

The last part of the thesis presents a case study of communication-avoiding sparse-dense matrix multiplication as used in graphical model structure learning. The resulting high-performance sparse inverse covariance matrix estimation algorithm enables processing high-dimensional data with arbitrary underlying structures at a scale that was previously intractable, e.g., 1.28 million dimensions (over 800 billion parameters) in under 21 minutes on 24,576 cores of a Cray XC30. Our method is used to automatically estimate the underlying functional connectivity of the human brain from resting-state fMRI data; the results show good agreement with a state-of-the-art clustering from the neuroscience literature that relied on manual intervention.
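The abstract does not spell out why data layout governs communication cost. As a back-of-the-envelope sketch (using the classic per-processor bandwidth bounds for dense matrix multiplication, not the thesis's sparse algorithms), comparing the words each processor receives under a 1D block-row layout versus a 2D grid layout exposes the sqrt(p) gap that communication-avoiding algorithms exploit:

```python
import math

def words_moved_1d(n, p):
    # 1D (block-row) dense n x n matmul: each processor eventually
    # needs all of the second operand, so it receives about n*n words,
    # independent of the processor count p.
    return n * n

def words_moved_2d(n, p):
    # 2D (SUMMA/Cannon-style) layout on a sqrt(p) x sqrt(p) grid:
    # each processor receives on the order of n*n / sqrt(p) words.
    return n * n / math.sqrt(p)

n, p = 1_000, 64
ratio = words_moved_1d(n, p) / words_moved_2d(n, p)  # sqrt(p) = 8
```

Sparsity complicates this picture, which is why the thesis tailors the algorithms to the sparsity pattern (cutoff distance, symmetry, or data sparsity) rather than applying the dense bounds directly.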